AITopics | Innlandet

Collaborating Authors

Innlandet

PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models

Neural Information Processing SystemsFeb-17-2026, 14:31:55 GMT

Models (LLMs) can sometimes surpass those annotated by experts, such as journalists, according to human evaluations. However, there is limited research on whether these generic summaries meet the individual needs of ordinary people.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Europe > Norway > Eastern Norway > Innlandet > Lillehammer (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ACADATA: Parallel Dataset of Academic Data for Machine Translation

Lacunza, Iñaki, Gilabert, Javier Garcia, Fornaciari, Francesca De Luca, Aula-Blasco, Javier, Gonzalez-Agirre, Aitor, Melero, Maite, Villegas, Marta

arXiv.org Artificial IntelligenceOct-16-2025

We present ACADATA, a high-quality parallel dataset for academic translation, that consists of two subsets: ACAD-TRAIN, which contains approximately 1.5 million author-generated paragraph pairs across 96 language directions and ACAD-BENCH, a curated evaluation set of almost 6,000 translations covering 12 directions. To validate its utility, we fine-tune two Large Language Models (LLMs) on ACAD-TRAIN and benchmark them on ACAD-BENCH against specialized machine-translation systems, general-purpose, open-weight LLMs, and several large-scale proprietary models. Experimental results demonstrate that fine-tuning on ACAD-TRAIN leads to improvements in academic translation quality by +6.1 and +12.4 d-BLEU points on average for 7B and 2B models respectively, while also improving long-context translation in a general domain by up to 24.9% when translating out of English. The fine-tuned top-performing model surpasses the best propietary and open-weight models on academic translation domain. By releasing ACAD-TRAIN, ACAD-BENCH and the fine-tuned models, we provide the community with a valuable resource to advance research in academic domain and long-context translation.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.12621

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
Europe > Slovenia (0.04)
(25 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Government (0.92)
Leisure & Entertainment > Games (0.67)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models

Neural Information Processing SystemsOct-10-2025, 13:55:31 GMT

dataset, generic summary, personalsum, (15 more...)

Neural Information Processing Systems

Country: Europe > Norway > Eastern Norway > Innlandet > Lillehammer (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

HistoryBankQA: Multilingual Temporal Question Answering on Historical Events

Mandal, Biswadip, Khandelwal, Anant, Gupta, Manish

arXiv.org Artificial IntelligenceSep-17-2025

Temporal reasoning about historical events is a critical skill for NLP tasks like event extraction, historical entity linking, temporal question answering, timeline summarization, temporal event clustering and temporal natural language inference. Yet efforts on benchmarking temporal reasoning capabilities of large language models (LLMs) are rather limited. Existing temporal reasoning datasets are limited in scale, lack multilingual coverage and focus more on contemporary events. To address these limitations, we present HistoryBank, a multilingual database of 10M+ historical events extracted from Wikipedia timeline pages and article infoboxes. Our database provides unprecedented coverage in both historical depth and linguistic breadth with 10 languages. Additionally, we construct a comprehensive question answering benchmark for temporal reasoning across all languages. This benchmark covers a diverse set of 6 temporal QA reasoning tasks, and we evaluate a suite of popular language models (LLaMA-3-8B, Mistral-7B, Gemma-2-9b, Qwen3-8B, GPT4o) to assess their performance on these tasks. As expected GPT4o performs best across all answer types and languages; Gemma-2 outperforms the other small language models. Our work aims to provide a comprehensive resource for advancing multilingual and temporally-aware natural language understanding of historical events. To facilitate further research, we will make our code and datasets publicly available upon acceptance of this paper.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.1272

Country:

Europe > Russia (0.15)
Asia > Russia (0.15)
Asia > Indonesia (0.14)
(96 more...)

Genre: Research Report (0.81)

Industry:

Leisure & Entertainment > Sports (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Media (0.68)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need

Dedhia, Bhishma, Kansal, Yuval, Jha, Niraj K.

arXiv.org Artificial IntelligenceSep-3-2025

Language models traditionally used for cross-domain generalization have recently demonstrated task-specific reasoning. However, their top-down training approach on general corpora is insufficient for acquiring abstractions needed for deep domain expertise. This may require a bottom-up approach that acquires expertise by learning to compose simple domain concepts into more complex ones. A knowledge graph (KG) provides this compositional structure, where domain primitives are represented as head-relation-tail edges and their paths encode higher-level concepts. We present a task generation pipeline that synthesizes tasks directly from KG primitives, enabling models to acquire and compose them for reasoning. We fine-tune language models on the resultant KG-grounded curriculum to demonstrate domain-specific superintelligence. While broadly applicable, we validate our approach in medicine, where reliable KGs exist. Using a medical KG, we curate 24,000 reasoning tasks paired with thinking traces derived from diverse medical primitives. We fine-tune the QwQ-32B model on this curriculum to obtain QwQ-Med-3 that takes a step towards medical superintelligence. We also introduce ICD-Bench, an evaluation suite to quantify reasoning abilities across 15 medical domains. Our experiments demonstrate that QwQ-Med-3 significantly outperforms state-of-the-art reasoning models on ICD-Bench categories. Further analysis reveals that QwQ-Med-3 utilizes acquired primitives to widen the performance gap on the hardest tasks of ICD-Bench. Finally, evaluation on medical question-answer benchmarks shows that QwQ-Med-3 transfers acquired expertise to enhance the base model's performance. While the industry's approach to artificial general intelligence (AGI) emphasizes broad expertise, we envision a future in which AGI emerges from the composable interaction of efficient domain-specific superintelligent agents.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2507.13966

Country:

Europe > France (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(13 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

InPars+: Supercharging Synthetic Data Generation for Information Retrieval Systems

Krastev, Matey, Hamar, Miklos, Toapanta, Danilo, Brouwers, Jesse, Lei, Yibin

arXiv.org Artificial IntelligenceAug-20-2025

This work revisits and extends synthetic query generation pipelines for Neural Information Retrieval (NIR) by leveraging the InPars Toolkit, a reproducible, end-to-end framework for generating training data using large language models (LLMs). We first assess the reproducibility of the original InPars, InPars-V2, and Promptagator pipelines on the SciFact benchmark and validate their effectiveness using open-source reranker and generator models. Building on this foundation, we introduce two key extensions to the pipeline: (1) fine-tuning a query generator LLM via Contrastive Preference Optimization (CPO) to improve the signal quality in generated queries, and (2) replacing static prompt templates with dynamic, Chain-of-Thought (CoT) optimized prompts using the DSPy framework. Our results show that both extensions reduce the need for aggressive filtering while improving retrieval performance. All code, models, and synthetic datasets are publicly released to support further research at: \href{https://github.com/danilotpnta/IR2-project}{this https URL}.

information retrieval, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.1393

Country:

Europe > Netherlands > North Holland > Amsterdam (0.77)
Europe > Norway > Eastern Norway > Innlandet > Hamar (0.40)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Fast and Generalizable parameter-embedded Neural Operators for Lithium-Ion Battery Simulation

Panahi, Amir Ali, Luder, Daniel, Wu, Billy, Offer, Gregory, Sauer, Dirk Uwe, Li, Weihan

arXiv.org Artificial IntelligenceAug-12-2025

Reliable digital twins of lithium-ion batteries must achieve high physical fidelity with sub-millisecond speed. In this work, we benchmark three operator-learning surrogates for the Single Particle Model (SPM): Deep Operator Networks (DeepONets), Fourier Neural Operators (FNOs) and a newly proposed parameter-embedded Fourier Neural Operator (PE-FNO), which conditions each spectral layer on particle radius and solid-phase diffusivity. Models are trained on simulated trajectories spanning four current families (constant, triangular, pulse-train, and Gaussian-random-field) and a full range of State-of-Charge (SOC) (0 % to 100 %). DeepONet accurately replicates constant-current behaviour but struggles with more dynamic loads. The basic FNO maintains mesh invariance and keeps concentration errors below 1 %, with voltage mean-absolute errors under 1.7 mV across all load types. Introducing parameter embedding marginally increases error, but enables generalisation to varying radii and diffusivities. PE-FNO executes approximately 200 times faster than a 16-thread SPM solver. Consequently, PE-FNO's capabilities in inverse tasks are explored in a parameter estimation task with Bayesian optimisation, recovering anode and cathode diffusivities with 1.14 % and 8.4 % mean absolute percentage error, respectively, and 0.5918 percentage points higher error in comparison with classical methods. These results pave the way for neural operators to meet the accuracy, speed and parametric flexibility demands of real-time battery management, design-of-experiments and large-scale inference. PE-FNO outperforms conventional neural surrogates, offering a practical path towards high-speed and high-fidelity electrochemical digital twins.

artificial intelligence, machine learning, operator, (19 more...)

arXiv.org Artificial Intelligence

2508.08087

Country:

Europe > Germany (0.14)
Europe > United Kingdom > North Sea > Southern North Sea (0.04)
Europe > Norway > Eastern Norway > Innlandet > Hamar (0.04)

Genre: Research Report (0.40)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Diagnostic-free onboard battery health assessment

Che, Yunhong, Lam, Vivek N., Rhyu, Jinwook, Schaeffer, Joachim, Kim, Minsu, Bazant, Martin Z., Chueh, William C., Braatz, Richard D.

arXiv.org Artificial IntelligenceMar-10-2025

Diverse usage patterns induce complex and variable aging behaviors in lithium-ion batteries, complicating accurate health diagnosis and prognosis. Separate diagnostic cycles are often used to untangle the battery's current state of health from prior complex aging patterns. However, these same diagnostic cycles alter the battery's degradation trajectory, are time-intensive, and cannot be practically performed in onboard applications. In this work, we leverage portions of operational measurements in combination with an interpretable machine learning model to enable rapid, onboard battery health diagnostics and prognostics without offline diagnostic testing and the requirement of historical data. We integrate mechanistic constraints within an encoder-decoder architecture to extract electrode states in a physically interpretable latent space and enable improved reconstruction of the degradation path. The health diagnosis model framework can be flexibly applied across diverse application interests with slight fine-tuning. We demonstrate the versatility of this model framework by applying it to three battery-cycling datasets consisting of 422 cells under different operating conditions, highlighting the utility of an interpretable diagnostic-free, onboard battery diagnosis and prognosis model.

battery, dataset, prediction, (17 more...)

arXiv.org Artificial Intelligence

2503.07383

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Health & Medicine (1.00)
Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Monitoring snow avalanches from SAR data with deep learning

Bianchi, Filippo Maria, Grahn, Jakob

arXiv.org Artificial IntelligenceFeb-25-2025

Snow avalanches present significant risks to human life and infrastructure, particularly in mountainous regions, making effective monitoring crucial. Traditional monitoring methods, such as field observations, are limited by accessibility, weather conditions, and cost. Satellite-borne Synthetic Aperture Radar (SAR) data has become an important tool for large-scale avalanche detection, as it can capture data in all weather conditions and across remote areas. However, traditional processing methods struggle with the complexity and variability of avalanches. This chapter reviews the application of deep learning for detecting and segmenting snow avalanches from SAR data. Early efforts focused on the binary classification of SAR images, while recent advances have enabled pixel-level segmentation, providing greater accuracy and spatial resolution. A case study using Sentinel-1 SAR data demonstrates the effectiveness of deep learning models for avalanche segmentation, achieving superior results over traditional methods. We also present an extension of this work, testing recent state-of-the-art segmentation architectures on an expanded dataset of over 4,500 annotated SAR images. The best-performing model among those tested was applied for large-scale avalanche detection across the whole of Norway, revealing important spatial and temporal patterns over several winter seasons.

avalanch, detection, segmentation, (12 more...)

arXiv.org Artificial Intelligence

2502.18157

Country:

Europe > Switzerland (0.04)
Europe > Norway > Eastern Norway > Innlandet > Hamar (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Austria (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Real-world Troublemaker: A 5G Cloud-controlled Track Testing Framework for Automated Driving Systems in Safety-critical Interaction Scenarios

Zhang, Xinrui, Xiong, Lu, Zhang, Peizhi, Huang, Junpeng, Ma, Yining

arXiv.org Artificial IntelligenceFeb-23-2025

--Track testing plays a critical role in the safety evaluation of autonomous driving systems (ADS), as it provides a real-world interaction environment. However, the inflexibility in motion control of object targets and the absence of intelligent interactive testing methods often result in pre-fixed and limited testing scenarios. T o address these limitations, we propose a novel 5G cloud-controlled track testing framework, Real-world Troublemaker . This framework overcomes the rigidity of traditional pre-programmed control by leveraging 5G cloud-controlled object targets integrated with the Internet of Things (IoT) and vehicle teleoperation technologies. Unlike conventional testing methods that rely on pre-set conditions, we propose a dynamic game strategy based on a quadratic risk interaction utility function, facilitating intelligent interactions with the vehicle under test (VUT) and creating a more realistic and dynamic interaction environment. The proposed framework has been successfully implemented at the T ongji University Intelligent Connected V ehicle Evaluation Base. Field test results demonstrate that Troublemaker can perform dynamic interactive testing of ADS accurately and effectively. Compared to traditional methods, Troublemaker improves scenario reproduction accuracy by 65.2%, increases the diversity of interaction strategies by approximately 9.2 times, and enhances exposure frequency of safety-critical scenarios by 3.5 times in unprotected left-turn scenarios. Index T erms --Automated driving systems, track testing, 5G, cloud-controlled object targets, interaction scenarios. HE safety of automated driving systems (ADS) must be ensured prior to their practical implementation, which requires a well-established testing framework [1]. Existing test standards, such as ISO 26262 [2], UN R157 [3], and UN R171 [4], are not sufficient to comprehensively evaluate ADS. According to the driving automation levels defined by SAE J3016 from SAE International, a high-level ADS (i.e., Level 3 or higher) is expected to perform driving tasks independently and autonomously, with the driver no longer retaining continuous control over vehicle movement [5]. While ADS has already been deployed in various countries and regions, numerous ADS traffic incidents highlight that safety testing for high-level ADS remains a critical technical challenge. In comparison to traditional vehicles and advanced driver assistance systems (ADAS), high-level ADS testing faces significant transformations and challenges, particularly in terms of both test subjects and requirements.

scenario, troublemaker, vehicle, (15 more...)

arXiv.org Artificial Intelligence

2502.14574

Country:

Europe > Switzerland > Geneva > Geneva (0.04)
Europe > Norway > Eastern Norway > Innlandet > Hamar (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Add feedback